Best-Worst Scaling More Reliable than Rating Scales: A Case Study on Sentiment Intensity Annotation

Authors

  • Svetlana Kiritchenko
  • Saif Mohammad
Abstract

Rating scales are a widely used method for data annotation; however, they present several challenges, such as difficulty in maintaining inter- and intra-annotator consistency. Best–worst scaling (BWS) is an alternative method of annotation that is claimed to produce high-quality annotations while keeping the required number of annotations similar to that of rating scales. However, the veracity of this claim has never been systematically established. Here, for the first time, we set up an experiment that directly compares the rating scale method with BWS. We show that with the same total number of annotations, BWS produces significantly more reliable results than the rating scale.
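The abstract does not say how BWS annotations are turned into scores. One commonly used approach is a simple counting procedure: an item's score is the fraction of times it was chosen as best minus the fraction of times it was chosen as worst, giving a value between -1 and 1. The sketch below illustrates that procedure under this assumption; the function name `bws_scores`, the tuple layout, and the example words are hypothetical and not taken from the paper.

```python
from collections import Counter


def bws_scores(annotations):
    """Score items from best-worst annotations by simple counting.

    `annotations` is an iterable of (items, best, worst) tuples, where
    `items` is the tuple of options shown to the annotator and `best` and
    `worst` are the two options the annotator picked from it.
    (This layout is an assumption made for illustration.)
    """
    seen = Counter()   # how often each item was shown
    best = Counter()   # how often each item was picked as best
    worst = Counter()  # how often each item was picked as worst
    for items, chosen_best, chosen_worst in annotations:
        seen.update(items)
        best[chosen_best] += 1
        worst[chosen_worst] += 1
    # score = %best - %worst, so every score lies in [-1, 1]
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}


# Hypothetical annotations: three 4-tuples of sentiment words.
example = [
    (("great", "good", "okay", "awful"), "great", "awful"),
    (("good", "okay", "bad", "awful"), "good", "awful"),
    (("great", "okay", "bad", "good"), "great", "bad"),
]
scores = bws_scores(example)
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:+.2f}")
```

Each annotation yields two judgments at once (one best, one worst), which is why BWS can keep the total number of annotations comparable to a rating-scale setup.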

Similar resources

Capturing Reliable Fine-Grained Sentiment Associations by Crowdsourcing and Best-Worst Scaling

Access to word–sentiment associations is useful for many applications, including sentiment analysis, stance detection, and linguistic analysis. However, manually assigning fine-grained sentiment association scores to words has many challenges with respect to keeping annotations consistent. We apply the annotation technique of Best–Worst Scaling to obtain real-valued sentiment association scores ...

Experimental measurement of preferences in health and healthcare using best-worst scaling: an overview

Best-worst scaling (BWS), also known as maximum-difference scaling, is a multiattribute approach to measuring preferences. BWS aims at the analysis of preferences regarding a set of attributes, their levels or alternatives. It is a stated-preference method based on the assumption that respondents are capable of making judgments regarding the best and the worst (or the most and least important, ...

Psychometric Properties of Pain Intensity Scales in Isfahanian Geriatric Population

Introduction: Given the importance of pain assessment in older adults, instrumentation for pain measurement is essential. The aim of this study is to compare the psychometric properties of three commonly used pain intensity scales (Numeric Rating Scale, Verbal Descriptor Scale (VDS), and Faces Pain Scale Revised (FPS-R)) in Isfahanian older adults, to identify the most validated and reliable...

Happy Accident: A Sentiment Composition Lexicon for Opposing Polarity Phrases

Sentiment composition is the determination of the sentiment of a multi-word linguistic unit, such as a phrase or a sentence, based on its constituents. We focus on sentiment composition in phrases formed by at least one positive and at least one negative word, such as "happy accident" and "best winter break". We refer to such phrases as opposing polarity phrases. We manually annotate a collection of ...

Emotion Intensities in Tweets

This paper examines the task of detecting intensity of emotion from text. We create the first datasets of tweets annotated for anger, fear, joy, and sadness intensities. We use a technique called best–worst scaling (BWS) that improves annotation consistency and obtains reliable fine-grained scores. We show that emotion-word hashtags often impact emotion intensity, usually conveying a more inten...

Publication date: 2017